This Quarto computational document covers the post-processing of
already clustered Xenium data generated by the
process_Xenium_data.R script. This post-processing
includes marker gene analysis and cluster annotation.
Loading required libraries
library(ggplot2, lib.loc = "/beegfs/homes/skulkarni/R/x86_64-pc-linux-gnu-library/4.3")
library(Seurat, lib.loc = "/beegfs/homes/skulkarni/R/x86_64-pc-linux-gnu-library/4.3")
library(patchwork)
library(markdown)
library(rmarkdown)
library(dplyr, lib.loc = "/beegfs/homes/skulkarni/R/x86_64-pc-linux-gnu-library/4.3")
library(presto, lib.loc = "/beegfs/homes/skulkarni/R/x86_64-pc-linux-gnu-library/4.3")
Read in the stored Xenium objects for both regions of 0027119
region1 <- readRDS("/prj/XeniumProbeDesign/kidney_Nphs2-mice_Xenium_Martin/xenium_objects_R/output-XETG00046__0027119__Region_1__20240621__120943_processed.rds")
region2 <- readRDS("/prj/XeniumProbeDesign/kidney_Nphs2-mice_Xenium_Martin/xenium_objects_R/output-XETG00046__0027119__Region_2__20240621__120943_processed.rds")
DimPlots of both regions
p1 <- DimPlot(region1) + labs(title = "Region 1")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
p2 <- DimPlot(region2) + labs(title = "Region 2")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
p1 | p2
Because the DimPlots of region1 and region2 look similar and the two regions appear to share many marker genes, cell-type annotation is performed on one region and reused for the second region.
A separate R script was used to merge the two regions because memory was running out. This R script was submitted as a SLURM job. Its output is a processed, merged Xenium object that contains both regions. The following operations were performed during this processing:

* Merging
* Normalising and scaling data
* Re-computing PCA and UMAP
* Finding neighbors
* Finding clusters

In the next steps, cluster annotation is performed.

#### Marker gene analysis

Read in the merged RDS object created by the above R script
xenium.obj.0027119 <- readRDS("/prj/XeniumProbeDesign/kidney_Nphs2-mice_Xenium_Martin/xenium_objects_R/0027119_merged_processed.rds")
Find marker genes using the presto package (much faster than the equivalent Seurat functions)
all_markers_0027119 <- presto::wilcoxauc(xenium.obj.0027119, seurat_assay = "SCT", group_by = "seurat_clusters")
write.csv(all_markers_0027119, "/prj/XeniumProbeDesign/kidney_Nphs2-mice_Xenium_Martin/xenium_objects_R/0027119_merged_markergenes.csv", quote = FALSE)
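As a minimal sketch of how the marker table might be narrowed down before manual annotation, the snippet below keeps significantly up-regulated genes and picks the top hit per cluster by AUC. The column names (`feature`, `group`, `logFC`, `auc`, `padj`) follow the `presto::wilcoxauc` output; the toy values, the 0.05 cut-off and the helper `top_n_per_cluster` are illustrative assumptions and not part of the original analysis.

```r
# Toy stand-in for all_markers_0027119 (values are made up for illustration).
markers <- data.frame(
  feature = c("Nphs2", "Ddn", "Aqp1", "Nphs2", "Lrp2", "Aqp1"),
  group   = c("9", "9", "9", "0", "0", "0"),
  logFC   = c(2.1, 1.4, -0.3, -0.8, 1.9, 1.2),
  auc     = c(0.95, 0.88, 0.40, 0.20, 0.91, 0.80),
  padj    = c(1e-50, 1e-30, 0.2, 1e-10, 1e-40, 1e-20),
  stringsAsFactors = FALSE
)

# Keep genes that are significantly enriched in a cluster (assumed cut-offs).
keep <- markers[markers$padj < 0.05 & markers$logFC > 0, ]

# Hypothetical helper: the n genes with the highest AUC within each cluster.
top_n_per_cluster <- function(df, n = 2) {
  pieces <- lapply(split(df, df$group), function(d) head(d[order(-d$auc), ], n))
  do.call(rbind, pieces)
}

top <- top_n_per_cluster(keep, n = 1)
print(top[, c("group", "feature", "auc")])
```

Ranking by AUC rather than by p-value is one reasonable choice here, since with more than 100,000 cells almost every adjusted p-value is extremely small.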
To select marker genes per cell type, I consulted several papers reporting scRNA-seq of mouse kidney. Of those tried, the following paper had the most overlapping cell types: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10690238/#SD1
To avoid the time needed to render a feature plot for every gene, only one representative gene is plotted per cell type (two for the proximal tubule).
Function to plot expression levels on the UMAP and at the spatial level
plot_featureplots <- function(gene) {
  p1 <- FeaturePlot(xenium.obj.0027119, features = gene, label = TRUE) + labs(title = "On UMAP Level")
  p2 <- ImageFeaturePlot(xenium.obj.0027119, features = gene, dark.background = FALSE) + labs(title = "On Spatial Level")
  return(p1 | p2)
}
Podocyte markers -> Nphs2, Ddn, Clic3 and Rab3b
plot_featureplots("Nphs2")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
Clusters 9 and 14 are podocytes.
Proximal tubule segments (PTS1/2/3):

* PTS1 -> Slc22a8, Lrp2, Cyp4b1
* PTS2 -> same as PTS1
* PTS3 -> Aadat, Aqp1

plot_featureplots("Slc22a8")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
plot_featureplots("Aqp1")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
Clusters 0, 20, 28, 2, 21, 15, 26, 36, 11, 30, 31, 24, 35, 1 and 34 are proximal tubule clusters (segments 1, 2 and 3).
Thin limb of the loop of Henle (LOH) associated cells -> Bst1, Aqp1, Cryab, Pax8, Upk3b
plot_featureplots("Bst1")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
Cluster 17 contains LOH cells
Thick ascending limb of the loop of Henle (TAL):

* TAL1 -> Sostdc1, Gpx6, Ppargc1a, Prox1
* TAL2 -> Slc5a1, Dusp15 + TAL1 markers
* TAL3 -> Gpx6, Scin

plot_featureplots("Ppargc1a")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
Clusters 4, 8, 10, 25, 33 are TAL clusters
Distal convoluted and connecting tubule (DCT-CNT) -> Calb1, Hsd11b2, Ldhb
plot_featureplots("Hsd11b2")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
Clusters 5 and 29 are DCT-CNT cells
Principal cells (PC) -> Fxyd4, Hsd11b2, Aqp3
plot_featureplots("Fxyd4")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
Intercalated cells (IC) -> Slc4a1, Car2
plot_featureplots("Slc4a1")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
The IC and PC markers overlap with the DCT-CNT clusters, so clusters 5 and 29 are annotated jointly as DCT, IC and PC cells.
Urothelium -> Upk1b, Upk3b
plot_featureplots("Upk1b")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
Cluster 22 contains urothelium cells
Fibroblasts -> Fbln5, Angptl2, Mylk, Col4a2
plot_featureplots("Fbln5")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
Clusters 6, 12, 13 and 16 are fibroblasts
Pericytes -> Myh11, Ren1, Gja5
plot_featureplots("Myh11")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
Cluster 8 contains pericytes
Vascular cells -> Plvap, Kdr
plot_featureplots("Plvap")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
Cluster 3 contains vascular cells
Macrophages -> C5ar1, Cybb, Mpeg1
plot_featureplots("Cybb")
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
Cluster 7 contains macrophages
Dendritic cell marker genes are not present in the gene panel, so dendritic cells could not be identified. T cells were also not found in this dataset.
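Whether a cell type can be annotated at all depends on how many of its markers are in the probe panel. The sketch below shows one way this could be checked up front. `panel_genes` is a hypothetical stand-in for `rownames(xenium.obj.0027119)` and lists only the genes plotted in this document, so genes reported as missing here reflect the toy panel and not necessarily the real one.

```r
# Hypothetical stand-in for rownames(xenium.obj.0027119); the real panel is larger.
panel_genes <- c("Nphs2", "Slc22a8", "Aqp1", "Bst1", "Ppargc1a", "Hsd11b2",
                 "Fxyd4", "Slc4a1", "Upk1b", "Fbln5", "Cybb")

# A few of the marker sets listed in the annotation notes.
marker_sets <- list(
  Podocytes   = c("Nphs2", "Ddn", "Clic3", "Rab3b"),
  Urothelium  = c("Upk1b", "Upk3b"),
  Macrophages = c("C5ar1", "Cybb", "Mpeg1")
)

# Split every marker set into genes found in the panel and genes that are absent.
coverage <- lapply(marker_sets, function(genes) {
  list(present = intersect(genes, panel_genes),
       missing = setdiff(genes, panel_genes))
})

# Number of usable markers per cell type; zero means the type cannot be called.
n_present <- vapply(coverage, function(x) length(x$present), integer(1))
print(n_present)
```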
Rename all clusters.
new.cluster.ids <- c("PT", "PT", "PT", "Vascular", "TAL", "DCT-IC-PC", "Fibroblasts", "Macrophages", "TAL",
                     "Podocytes", "TAL", "PT", "Fibroblasts", "Fibroblasts", "Podocytes", "PT", "Fibroblasts",
                     "LOH", "Unidentified1", "Injured cells", "PT", "PT", "Urothelium", "Unidentified2", "PT", "TAL", "PT", "Unidentified3", "PT",
                     "DCT-IC-PC", "PT", "PT", "PT", "TAL", "PT", "PT", "PT")
names(new.cluster.ids) <- levels(xenium.obj.0027119)
xenium.obj.0027119 <- RenameIdents(xenium.obj.0027119, new.cluster.ids)
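A positional label vector is easy to get out of step with the per-cluster notes. As a hedged sketch, the notes can be collected in a named list and checked for clusters that are claimed twice or not covered at all. The name `assignments` is hypothetical, and the cluster range 0 to 36 is inferred from the 37 labels in `new.cluster.ids`.

```r
# Cluster-to-cell-type assignments as stated in the annotation notes.
assignments <- list(
  Podocytes   = c(9, 14),
  PT          = c(0, 20, 28, 2, 21, 15, 26, 36, 11, 30, 31, 24, 35, 1, 34),
  LOH         = 17,
  TAL         = c(4, 8, 10, 25, 33),
  "DCT-IC-PC" = c(5, 29),
  Urothelium  = 22,
  Fibroblasts = c(6, 12, 13, 16),
  Pericytes   = 8,
  Vascular    = 3,
  Macrophages = 7
)

# Flatten into a long table: one row per (cluster, cell type) pair.
lookup <- data.frame(
  cluster   = unlist(assignments, use.names = FALSE),
  cell_type = rep(names(assignments), lengths(assignments)),
  stringsAsFactors = FALSE
)

# Clusters claimed by more than one cell type need a manual decision.
conflicts <- unique(lookup$cluster[duplicated(lookup$cluster)])

# Clusters in the assumed range 0..36 that no marker set covered.
unassigned <- setdiff(0:36, lookup$cluster)

print(lookup[lookup$cluster %in% conflicts, ])
print(unassigned)
```

With the assignments as written in the notes, `conflicts` flags cluster 8 (listed under both TAL and pericytes) and `unassigned` lists clusters 18, 19, 23, 27 and 32.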
DimPlot after changing cluster names
DimPlot(xenium.obj.0027119, label = TRUE)
## Rasterizing points since number of points exceeds 100,000.
## To disable this behavior set `raster=FALSE`
ImageDimPlot after changing cluster names
ImageDimPlot(xenium.obj.0027119, dark.background = FALSE)
## Warning: No FOV associated with assay 'SCT', using global default FOV